Reducing Label Complexity by Learning From Bags

Authors

  • Sivan Sabato
  • Nathan Srebro
  • Naftali Tishby
Abstract

We consider a supervised learning setting in which the main cost of learning is the number of training labels and one can obtain a single label for a bag of examples, indicating only if a positive example exists in the bag, as in Multi-Instance Learning. We thus propose to create a training sample of bags, and to use the obtained labels to learn to classify individual examples. We provide a theoretical analysis showing how to select the bag size as a function of the problem parameters, and prove that if the original labels are distributed unevenly, the number of required labels drops considerably when learning from bags. We demonstrate that finding a low-error separating hyperplane from bags is feasible in this setting using a simple iterative procedure similar to latent SVM. Experiments on synthetic and real data sets demonstrate the success of the approach.
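
The labeling model described in the abstract (one label per bag, positive only if the bag contains at least one positive example) can be illustrated with a minimal Python sketch. The function names, the bag size of 10, and the 5% positive rate are assumptions chosen for illustration; the sketch does not reproduce the paper's bag-size analysis or its latent-SVM-like procedure.

```python
import random


def make_bags(labels, bag_size, rng):
    """Randomly group instance indices into bags; label each bag with the OR of its instances."""
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    # Keep only full bags; a trailing incomplete bag is dropped.
    bags = [idx[i:i + bag_size]
            for i in range(0, len(idx) - bag_size + 1, bag_size)]
    bag_labels = [int(any(labels[j] for j in bag)) for bag in bags]
    return bags, bag_labels


def positive_bag_probability(p, bag_size):
    """Probability that a bag of i.i.d. instances (positive rate p) has at least one positive."""
    return 1.0 - (1.0 - p) ** bag_size


# Hypothetical data: 1000 instances with an uneven label distribution (about 5% positive).
rng = random.Random(0)
labels = [1 if rng.random() < 0.05 else 0 for _ in range(1000)]
bags, bag_labels = make_bags(labels, bag_size=10, rng=rng)

print(len(bags), "bag labels instead of", len(labels), "instance labels")
print("observed fraction of positive bags:", sum(bag_labels) / len(bag_labels))
print("expected under independence:", positive_bag_probability(0.05, 10))
```

When positives are rare, most small bags are negative, and a negative bag label reveals the label of every instance in it. This is the intuition behind labeling bags rather than individual examples.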

Similar resources

On the Complexity of One-class SVM for Multiple Instance Learning

In traditional multiple instance learning (MIL), both positive and negative bags are required to learn a prediction function. However, a high human cost is needed to know the label of each bag—positive or negative. Only positive bags contain our focus (positive instances) while negative bags consist of noise or background (negative instances). So we do not expect to spend too much to label the ...

Multi-instance learning with any hypothesis class

In the supervised learning setting termed Multiple-Instance Learning (MIL), the examples are bags of instances, and the bag label is a function of the labels of its instances. Typically, this function is the Boolean OR. The learner observes a sample of bags and the bag labels, but not the instance labels that determine the bag labels. The learner is then required to emit a classification rule f...

Learnability of the Superset Label Learning Problem

In the Superset Label Learning (SLL) problem, weak supervision is provided in the form of a superset of labels that contains the true label. If the classifier predicts a label outside of the superset, it commits a superset error. Most existing SLL algorithms learn a multiclass classifier by minimizing the superset error. However, only limited theoretical analysis has been dedicated to this appr...

Multiple Instance Metric Learning from Automatically Labeled Bags of Faces

Metric learning aims at finding a distance that approximates a task-specific notion of semantic similarity. Typically, a Mahalanobis distance is learned from pairs of data labeled as being semantically similar or not. In this paper, we learn such metrics in a weakly supervised setting where “bags” of instances are labeled with “bags” of labels. We formulate the problem as a multiple instance le...

Publication date: 2010